Reliable High Performance Peta- and Exa-Scale Computing
Related resources
Position Paper: Using a “Codelet” Program Execution Model for Exascale Machines
As computing has moved relentlessly through giga-, tera-, and peta-scale systems, exa-scale (a million trillion operations/sec.) computing is currently under active research. DARPA has recently sponsored the “UHPC” [1] — ubiquitous high-performance computing — program, encouraging partnership with academia and industry to explore such systems. Among the requirements are the development of novel...
Improving Application Resilience through Probabilistic Task Replication
Maintaining performance in a faulty distributed computing environment is a major challenge in the design of future peta- and exa-scale class systems. Better defining application resilience as a function of scale is key to developing reliable software systems and programming methodologies. This paper defines the resilience of a task as the survivability of that task (i.e., how well will it sur...
Large-scale Simulations of 3D Groundwater Flow using Parallel Geometric Multigrid Method
The multigrid method used with OpenMP/MPI hybrid parallel programming models is expected to play an important role in large-scale scientific computing on post-peta/exa-scale supercomputer systems. In the present work, the effect of sparse matrix storage formats on the performance of parallel geometric multigrid solvers was evaluated, and a new data structure for the Ellpack-Itpack (ELL) format ...
'Mutual Watch-dog Networking': Distributed Awareness of Faults and Critical Events in Petascale/Exascale systems
I. INTRODUCTION Features like resilience, power consumption, and availability of large-scale computing systems strongly depend on (1) the complexity of individual components (e.g. the gate count of each chip) and (2) the number of components in the system. Exa-scale computing systems and networks of 3G devices are examples of distributed systems composed of a huge number of high-complexity individua...
Open-architecture Implementation of Fragment Molecular Orbital Method for Peta-scale Computing
We present our perspective and goals on high-performance computing for nanoscience in accordance with the global trend toward “peta-scale computing.” After reviewing our results obtained through the grid-enabled version of the fragment molecular orbital method (FMO) on the grid testbed by the Japanese Grid Project, National Research Grid Initiative (NAREGI), we show that FMO is one of the best c...